The invention is described in the following statement:
A NOVEL GENE AND USES THEREFOR



The present invention relates generally to a novel human gene and to derivatives and
mammalian, animal, insect, nematode, avian and microbial homologues thereof. The
present invention further provides pharmaceutical compositions and diagnostic agents as well
as genetic molecules useful in gene replacement therapy and recombinant molecules useful
in protein replacement therapy.

Throughout this specification and the claims which follow, unless the context requires otherwise,
the word "comprise", or variations such as "comprises" or "comprising", will be understood to
imply the inclusion of a stated integer or group of integers but not the exclusion of any other
integer or group of integers.

Sequence Identity Numbers (SEQ ID NOs.) for the nucleotide and amino acid sequences referred 
to in the specification are defined at the end of the description. 

The increasing sophistication of recombinant DNA technology is greatly facilitating research
and development in the medical and allied health fields. There is a growing need to develop
recombinant and genetic molecules for use in diagnosis, conventional pharmaceutical
preparations as well as gene and protein replacement therapies.

In work leading up to the present invention, the inventors sought to identify and clone human
genes which might be useful as potential diagnostic and/or therapeutic agents. One area of
particular interest is in the field of gene regulators.

Gene expression generally requires interaction between a regulatory protein and an
appropriate recognition sequence of a target gene. Regulatory proteins comprise in many
cases a domain or motif that facilitates binding to DNA. One particular motif comprises
small sequence units repeated in tandem with each unit folded about a zinc atom to form
separate structural domains. This motif is now referred to as a zinc finger domain. Such a
domain is generally defined by the number of cysteine (C) and histidine (H) residues.

In accordance with the present invention, a gene has been identified from the human genome
with an N-terminal region resembling a zinc-finger domain of a novel type.

Accordingly, one aspect of the present invention contemplates an isolated nucleic acid
molecule comprising a sequence of nucleotides encoding or complementary to a sequence
encoding an amino acid sequence having homology to a regulator of gene expression or a
derivative of said gene regulator.

More particularly, the present invention provides an isolated nucleic acid molecule comprising
a sequence of nucleotides encoding or complementary to a sequence encoding a putative
regulator of gene expression wherein said regulator comprises a zinc finger domain of an
(HC3)2 type.

Even more particularly, the present invention is directed to an isolated nucleic acid molecule
comprising a sequence of nucleotides or a complementary form thereof selected from:

(i) a nucleotide sequence set forth in SEQ ID NO:1;
(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:2;
(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide
sequence of (i) or (ii); and
(iv) a nucleotide sequence capable of hybridizing under low stringency conditions to
the nucleotide sequence set forth in (i), (ii) or (iii).



In a related embodiment, the present invention provides an isolated nucleic acid molecule
comprising a sequence of nucleotides or a complementary form thereof selected from:

(i) a nucleotide sequence set forth in SEQ ID NO:3;
(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:4;
(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide
sequence of (i) or (ii); and
(iv) a nucleotide sequence capable of hybridizing under low stringency conditions to
the nucleotide sequence set forth in (i), (ii) or (iii).

Preferably, the percentage similarity is at least about 50%. More preferably, the percentage 
similarity is at least about 60%. 

Reference herein to a low stringency at 42°C includes and encompasses from at least about 1%
v/v to at least about 15% v/v formamide and from at least about 1M to at least about 2M salt for
hybridisation, and at least about 1M to at least about 2M salt for washing conditions. Alternative
stringency conditions may be applied where necessary, such as medium stringency, which
includes and encompasses from at least about 16% v/v to at least about 30% v/v formamide and
from at least about 0.5M to at least about 0.9M salt for hybridisation, and at least about 0.5M
to at least about 0.9M salt for washing conditions, or high stringency, which includes and
encompasses from at least about 31% v/v to at least about 50% v/v formamide and from at least
about 0.01M to at least about 0.15M salt for hybridisation, and at least about 0.01M to at least
about 0.15M salt for washing conditions.
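
By way of illustration only, the formamide and salt ranges recited above can be restated as a small lookup. The following Python sketch is not part of the described method: the names `STRINGENCY_RANGES` and `classify_stringency` are assumptions introduced here, and treating each recited range as a closed interval is a simplification of the "at least about" wording.

```python
# Illustrative sketch only: names and closed-interval handling are assumptions.

# (label, formamide % v/v range, salt molarity range) as recited for 42 degrees C
STRINGENCY_RANGES = [
    ("low", (1.0, 15.0), (1.0, 2.0)),
    ("medium", (16.0, 30.0), (0.5, 0.9)),
    ("high", (31.0, 50.0), (0.01, 0.15)),
]


def classify_stringency(formamide_pct, salt_molar):
    """Return 'low', 'medium' or 'high' when both values fall inside one
    recited range; return None when no single range covers both values."""
    for label, (f_lo, f_hi), (s_lo, s_hi) in STRINGENCY_RANGES:
        if f_lo <= formamide_pct <= f_hi and s_lo <= salt_molar <= s_hi:
            return label
    return None


if __name__ == "__main__":
    for formamide, salt in [(10, 1.5), (20, 0.7), (40, 0.1), (40, 1.5)]:
        print(formamide, salt, classify_stringency(formamide, salt))
```

A combination such as 40% v/v formamide with 1.5M salt falls outside all three recited ranges, so the sketch reports no label rather than guessing one.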

The term "similarity" as used herein includes exact identity between compared sequences at the
nucleotide or amino acid level. Where there is non-identity at the nucleotide level, "similarity"
includes differences between sequences which result in different amino acids that are nevertheless
related to each other at the structural, functional, biochemical and/or conformational levels.
Where there is non-identity at the amino acid level, "similarity" includes amino acids that are
nevertheless related to each other at the structural, functional, biochemical and/or conformational
levels.
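
The distinction drawn above between exact identity and the broader notion of similarity can be sketched in Python as follows. This is a hedged illustration, not a definition adopted by the specification: the function names are hypothetical, the two sequences are assumed to be already aligned and of equal length, and the residue groups in `RELATED_GROUPS` are an assumed, simplified stand-in for amino acids that are "related to each other".

```python
# Illustrative sketch only. The residue groups are an assumed simplification;
# the specification does not define which amino acids count as related.
RELATED_GROUPS = [
    set("AVLIM"),  # aliphatic / hydrophobic
    set("FWY"),    # aromatic
    set("KRH"),    # basic
    set("DE"),     # acidic
    set("STNQ"),   # polar, uncharged
]


def _check_aligned(seq_a, seq_b):
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be pre-aligned, equal-length and non-empty")


def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions carrying exactly the same symbol."""
    _check_aligned(seq_a, seq_b)
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def percent_similarity(seq_a, seq_b):
    """Percentage of aligned positions that are identical or share a group."""
    _check_aligned(seq_a, seq_b)

    def related(a, b):
        return a == b or any(a in g and b in g for g in RELATED_GROUPS)

    hits = sum(1 for a, b in zip(seq_a, seq_b) if related(a, b))
    return 100.0 * hits / len(seq_a)


if __name__ == "__main__":
    # First eight residues of SEQ ID NO:2 against a hypothetical Lys-to-Arg variant.
    print(percent_identity("MGLCKCPK", "MGLCRCPK"))
    print(percent_similarity("MGLCKCPK", "MGLCRCPK"))
```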



The present invention extends to nucleic acid molecules with percentage similarities of
approximately 65%, 70%, 75%, 80%, 85%, 90% or 95% or above or a percentage in between.

The nucleic acid molecule of the present invention is hereinafter referred to as constituting the
"mcg4" gene. The protein encoded by mcg4 is referred to herein as "MCG4". The mcg4
gene is proposed to encode, in accordance with the present invention, a regulator of gene
expression and to comprise the novel zinc finger domain (HC3)2. A regulator of gene
expression includes a transcription factor. Regulation may be at the level of nucleic
acid:protein or protein:protein interaction.

The present invention extends to the naturally occurring genomic mcg4 nucleotide sequence
or corresponding cDNA sequence or to derivatives thereof. Derivatives contemplated in the
present invention include fragments, parts, portions, mutants, homologues and analogues of
MCG4 or the corresponding genetic sequence. Derivatives also include single or multiple
amino acid substitutions, deletions and/or additions to MCG4 or single or multiple nucleotide
substitutions, deletions and/or additions to mcg4. "Additions" to the amino acid or nucleotide
sequences include fusions with other peptides, polypeptides or proteins or fusions to
nucleotide sequences. Reference herein to "MCG4" or "mcg4" includes references to all
derivatives thereof including functional derivatives and immunologically interactive
derivatives of MCG4.

The mcg4 of the present invention is particularly exemplified herein from humans and in
particular from human chromosome 11q13.

The present invention extends, however, to a range of homologues from, for example,
primates, livestock animals (eg. sheep, cows, horses, donkeys, pigs), companion animals (eg.
dogs, cats), laboratory test animals (eg. rabbits, mice, rats, guinea pigs), birds (eg. chickens,
ducks, geese, parrots), insects, nematodes, eukaryotic microorganisms and captive wild
animals (eg. deer, foxes, kangaroos). Reference herein to mcg4 or MCG4 includes reference
to these molecules of human origin as well as novel forms of non-human origin.

The nucleic acid molecules of the present invention may be DNA or RNA. When the nucleic
acid molecule is in DNA form, it may be genomic DNA or cDNA. RNA forms of the nucleic
acid molecules of the present invention are generally mRNA.

Although the nucleic acid molecules of the present invention are generally in isolated form, they
may be integrated into or ligated to or otherwise fused or associated with other genetic
molecules such as vector molecules and in particular expression vector molecules. Vectors and
expression vectors are generally capable of replication and, if applicable, expression in one or
both of a prokaryotic cell or a eukaryotic cell. Preferably, prokaryotic cells include E. coli,
Bacillus sp and Pseudomonas sp. Preferred eukaryotic cells include yeast, fungal, mammalian
and insect cells.

Accordingly, another aspect of the present invention contemplates a genetic construct comprising
a vector portion and an animal, more particularly a mammalian and even more particularly a
human mcg4 gene portion, which mcg4 gene portion is capable of encoding an MCG4
polypeptide or a functional or immunologically interactive derivative thereof.

Preferably, the mcg4 gene portion of the genetic construct is operably linked to a promoter on 
the vector such that said promoter is capable of directing expression of said mcg4 gene portion 
in an appropriate cell. 

In addition, the mcg4 gene portion of the genetic construct may comprise all or part of the gene
fused to another genetic sequence such as a nucleotide sequence encoding
glutathione-S-transferase or part thereof.

The present invention extends to such genetic constructs and to prokaryotic or eukaryotic cells
comprising same.

It is proposed in accordance with the present invention that MCG4 is a transcription factor
involved in gene regulation. Mutations in mcg4 may result in aberrations in gene regulation
leading to the development of or a propensity to develop various types of cancer. In this
regard, although not wishing to limit the present invention to any one hypothesis or mode of
action, it is proposed that mcg4 or its expression product may be involved in the
tissue-specific or temporal regulation of particular genes.

A deletion or aberration in the mcg4 gene may also be important in the detection of cancer
or a propensity to develop cancer. An aberration may be a homozygous mutation or a
heterozygous mutation. The detection may occur at the foetal or post-natal level. Detection
may also be at the germline or somatic cell level. Furthermore, a risk of developing cancer
may be determined by assaying for aberrations in the parents and/or proband of a subject
under investigation.

According to this aspect of the present invention, there is contemplated a method of detecting
a condition caused or facilitated by an aberration in mcg4, said method comprising
determining the presence of a single or multiple nucleotide substitution, deletion and/or
addition or other aberration to one or both alleles of said mcg4 wherein the presence of such
a nucleotide substitution, deletion and/or addition or other aberration may be indicative of
said condition or a propensity to develop said condition.

The nucleotide substitutions, additions or deletions may be detected by any convenient means
including nucleotide sequencing, restriction fragment length polymorphism (RFLP),
polymerase chain reaction (PCR), oligonucleotide hybridization and single stranded
conformation polymorphism analysis (SSCP) amongst many others. An aberration includes
modification to existing nucleotides such as to modify a glycosylation signal amongst other
effects.
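
Where nucleotide sequencing is the chosen means, the comparison of a sample sequence with a reference can be sketched as below. This is an illustrative Python sketch under stated assumptions: `find_aberrations` and its output format are hypothetical, the sample sequence is invented for the example, and the standard-library `difflib.SequenceMatcher` is a general text-comparison tool used here for brevity rather than a dedicated sequence aligner.

```python
# Illustrative sketch only: function name and output format are assumptions.
from difflib import SequenceMatcher

# Map difflib edit tags onto the terms used in the text.
KINDS = {"replace": "substitution", "delete": "deletion", "insert": "addition"}


def find_aberrations(reference, sample):
    """List differences between a reference and a sample nucleotide sequence.

    Each entry is (kind, 1-based reference position, reference bases,
    sample bases).
    """
    matcher = SequenceMatcher(None, reference, sample, autojunk=False)
    found = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            found.append((KINDS[tag], i1 + 1, reference[i1:i2], sample[j1:j2]))
    return found


if __name__ == "__main__":
    reference = "ATGGGGCTTTGT"  # first four codons of the coding region of SEQ ID NO:1
    sample = "ATGGGACTTTGT"     # hypothetical sample with one base changed
    print(find_aberrations(reference, sample))
```

For insertions and deletions inside runs of repeated bases the reported position is not unique, so a purpose-built alignment tool would be preferable in practice.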

In an alternative method, aberrations in the mcg4 gene are detected by screening for mutations
in MCG4.

A mutation in MCG4 may be a single or multiple amino acid substitution, addition and/or
deletion. The mutation in mcg4 may also result in either no translation product being
produced or a product in truncated form. A mutation may also result in an altered glycosylation
pattern or the introduction of side chain modifications to amino acid residues.

According to this aspect of the present invention, there is provided a method of detecting a
condition caused or facilitated by an aberration in mcg4, said method comprising screening
for a single or multiple amino acid substitution, deletion and/or addition to MCG4 wherein
the presence of such a mutation is indicative of said condition or a propensity to develop
said condition.

A particularly convenient means of detecting a mutation in MCG4 is by use of antibodies. 

Accordingly, another aspect of the present invention is directed to antibodies to MCG4 and its
derivatives. Such antibodies may be monoclonal or polyclonal and may be selected from
naturally occurring antibodies to MCG4 or may be specifically raised to MCG4 or derivatives
thereof. In the case of the latter, MCG4 or its derivatives may first need to be associated with
a carrier molecule. The antibodies to MCG4 of the present invention are particularly useful as
diagnostic agents.

For example, antibodies to MCG4 and its derivatives can be used to screen for wild-type MCG4
or for mutated MCG4 molecules. The latter may occur, for example, during or prior to certain
cancer development. A differential binding assay is also particularly useful. Techniques for such
assays are well known in the art and include, for example, sandwich assays and ELISA.
Knowledge of normal MCG4 levels or the presence of wild-type MCG4 may be important for
diagnosis of certain cancers or a predisposition for development of cancers or for monitoring
certain therapeutic protocols.

As stated above, antibodies to MCG4 of the present invention may be monoclonal or polyclonal
or may be fragments of antibodies such as Fab fragments. Furthermore, the present invention
extends to recombinant and synthetic antibodies and to antibody hybrids. A "synthetic antibody"
is considered herein to include fragments and hybrids of antibodies.

For example, specific antibodies can be used to screen for wild-type MCG4 molecule or specific
mutant molecules such as molecules having a certain deletion. This would be important, for
example, as a means for screening for levels of MCG4 in a cell extract or other biological fluid
or purifying MCG4 made by recombinant means from culture supernatant fluid or purified from
a cell extract. Techniques for the assays contemplated herein are known in the art and include,
for example, sandwich assays and ELISA.

It is within the scope of this invention to include any second antibodies (monoclonal, polyclonal
or fragments of antibodies or synthetic antibodies) directed to the first mentioned antibodies
discussed above. Both the first and second antibodies may be used in detection assays or a first
antibody may be used with a commercially available anti-immunoglobulin antibody. An antibody
as contemplated herein includes any antibody specific to any region of wild-type MCG4 or to a
specific mutant phenotype or to a deleted or otherwise altered region.

Both polyclonal and monoclonal antibodies are obtainable by immunization of a suitable animal
or bird with MCG4 or its derivatives and either type is utilizable for immunoassays. The
methods of obtaining both types of sera are well known in the art. Polyclonal sera are less
preferred but are relatively easily prepared by injection of a suitable laboratory animal or bird
with an effective amount of MCG4 or antigenic parts thereof or derivatives thereof, collecting
serum from the animal or bird, and isolating specific sera by any of the known immunoadsorbent
techniques. Although antibodies produced by this method are utilizable in virtually any type of
immunoassay, they are generally less favoured because of the potential heterogeneity of the
product.

The use of monoclonal antibodies in an immunoassay is particularly preferred because of the
ability to produce them in large quantities and the homogeneity of the product. The preparation
of hybridoma cell lines for monoclonal antibody production derived by fusing an immortal cell
line and lymphocytes sensitized against the immunogenic preparation can be done by techniques
which are well known to those who are skilled in the art.

Another aspect of the present invention contemplates a method for detecting MCG4 or a
derivative thereof in a biological sample, said method comprising contacting said biological
sample with an antibody specific for MCG4 or its derivatives or homologues for a time and under
conditions sufficient for an antibody-MCG4 complex to form, and then detecting said complex.
Preferably, the biological sample is a cell extract from a human or other animal or a bird.

The presence of MCG4 may be determined in a number of ways such as by Western blotting
and ELISA procedures. A wide range of immunoassay techniques are available as can be seen
by reference to US Patent Nos. 4,016,043, 4,424,279 and 4,018,653. These include both
single-site and two-site or "sandwich" assays of the non-competitive types, as well as traditional
competitive binding assays. These assays also include direct binding of a labelled antibody to a
target.

Sandwich assays are among the most useful and commonly used assays and are favoured for use
in the present invention. A number of variations of the sandwich assay technique exist, and all
are intended to be encompassed by the present invention. Briefly, in a typical forward assay, an
unlabelled antibody is immobilized on a solid substrate and the sample to be tested brought into
contact with the bound molecule. After a suitable period of incubation, sufficient to allow
formation of an antibody-antigen complex, a second antibody specific to the antigen, labelled
with a reporter molecule capable of producing a detectable signal, is then added and incubated,
allowing time sufficient for the formation of another complex of antibody-antigen-labelled
antibody. Any unreacted material is washed away, and the presence of the antigen is
determined by observation of a signal produced by the reporter molecule. The results may either
be qualitative, by simple observation of the visible signal, or may be quantitated by comparing
with a control sample containing known amounts of hapten. Variations on the forward assay
include a simultaneous assay, in which both sample and labelled antibody are added
simultaneously to the bound antibody. These techniques are well known to those skilled in the
art, including any minor variations as will be readily apparent. In accordance with the present
invention the sample is one which might contain MCG4 including cell extract or tissue biopsy.
The sample is, therefore, generally a biological sample comprising biological fluid but also
extends to fermentation fluid and supernatant fluid such as from a cell culture.
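
The quantitation step mentioned above, in which a signal is compared with controls containing known amounts, can be illustrated with a simple standard-curve lookup. The Python sketch below is a hedged example only: `interpolate_amount` is a hypothetical helper, the standards are invented numbers in arbitrary units, and linear interpolation between adjacent standards is an assumed simplification of how such a comparison might be made.

```python
# Illustrative sketch only: helper name, units and interpolation scheme are assumptions.


def interpolate_amount(standards, signal):
    """Estimate an analyte amount from a reporter signal.

    standards is a list of (known_amount, measured_signal) pairs. The estimate
    is a linear interpolation between the two standards whose signals bracket
    the sample signal; None is returned when the signal lies outside the range
    covered by the standards.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (amt_lo, sig_lo), (amt_hi, sig_hi) in zip(points, points[1:]):
        if sig_lo <= signal <= sig_hi:
            if sig_hi == sig_lo:
                return amt_lo
            fraction = (signal - sig_lo) / (sig_hi - sig_lo)
            return amt_lo + fraction * (amt_hi - amt_lo)
    return None


if __name__ == "__main__":
    # Hypothetical standards: (amount, signal), both in arbitrary units.
    standards = [(0.0, 0.0), (10.0, 0.5), (20.0, 1.0)]
    print(interpolate_amount(standards, 0.75))
    print(interpolate_amount(standards, 2.0))
```

Returning None for out-of-range signals reflects the design choice not to extrapolate beyond the control samples.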

In the typical forward sandwich assay, a first antibody having specificity for the MCG4 or an
antigenic part thereof or a derivative thereof or antigenic parts thereof, is either covalently or
passively bound to a solid surface. The solid surface is typically glass or a polymer, the most
commonly used polymers being cellulose, polyacrylamide, nylon, polystyrene, polyvinyl chloride
or polypropylene. The solid supports may be in the form of tubes, beads, discs of microplates,
or any other surface suitable for conducting an immunoassay. The binding processes are
well-known in the art and generally consist of cross-linking, covalently binding or physically
adsorbing; the polymer-antibody complex is then washed in preparation for the test sample. An
aliquot of the sample to be tested is then added to the solid phase complex and incubated for a
period of time sufficient (e.g. 2-40 minutes) and under suitable conditions (e.g. 25°C) to allow
binding of any subunit present in the antibody. Following the incubation period, the antibody
subunit solid phase is washed and dried and incubated with a second antibody specific for a
portion of the hapten. The second antibody is linked to a reporter molecule which is used to
indicate the binding of the second antibody to the hapten.

An alternative method involves immobilizing the target molecules in the biological sample and
then exposing the immobilized target to specific antibody which may or may not be labelled with
a reporter molecule. Depending on the amount of target and the strength of the reporter
molecule signal, a bound target may be detectable by direct labelling with the antibody.
Alternatively, a second labelled antibody, specific to the first antibody, is exposed to the
target-first antibody complex to form a target-first antibody-second antibody tertiary complex.
The complex is detected by the signal emitted by the reporter molecule.

By "reporter molecule" as used in the present specification, is meant a molecule which, by its
chemical nature, provides an analytically identifiable signal which allows the detection of
antigen-bound antibody. Detection may be either qualitative or quantitative. The most commonly
used reporter molecules in this type of assay are either enzymes, fluorophores or radionuclide
containing molecules (i.e. radioisotopes) and chemiluminescent molecules.

In the case of an enzyme immunoassay, an enzyme is conjugated to the second antibody,
generally by means of glutaraldehyde or periodate. As will be readily recognized, however, a
wide variety of different conjugation techniques exist, which are readily available to the skilled
artisan. Commonly used enzymes include horseradish peroxidase, glucose oxidase,
beta-galactosidase and alkaline phosphatase, amongst others. The substrates to be used with the
specific enzymes are generally chosen for the production, upon hydrolysis by the corresponding
enzyme, of a detectable colour change. Examples of suitable enzymes include alkaline
phosphatase and peroxidase. It is also possible to employ fluorogenic substrates, which yield a
fluorescent product rather than the chromogenic substrates noted above. In all cases, the
enzyme-labelled antibody is added to the first antibody hapten complex, allowed to bind, and
then the excess reagent is washed away. A solution containing the appropriate substrate is then
added to the complex of antibody-antigen-antibody. The substrate will react with the enzyme
linked to the second antibody, giving a qualitative visual signal, which may be further quantitated,
usually spectrophotometrically, to give an indication of the amount of hapten which was present
in the sample. "Reporter molecule" also extends to use of cell agglutination or inhibition of
agglutination such as red blood cells on latex beads, and the like.

Alternately, fluorescent compounds, such as fluorescein and rhodamine, may be chemically
coupled to antibodies without altering their binding capacity. When activated by illumination
with light of a particular wavelength, the fluorochrome-labelled antibody absorbs the light
energy, inducing a state of excitability in the molecule, followed by emission of the light at a
characteristic colour visually detectable with a light microscope. As in the EIA, the fluorescent
labelled antibody is allowed to bind to the first antibody-hapten complex. After washing off the
unbound reagent, the remaining tertiary complex is then exposed to the light of the appropriate
wavelength; the fluorescence observed indicates the presence of the hapten of interest.

Immunofluorescence and EIA techniques are both very well established in the art and are
particularly preferred for the present method. However, other reporter molecules, such as
radioisotope, chemiluminescent or bioluminescent molecules, may also be employed.

As stated above, the present invention extends to genetic constructs capable of encoding
MCG4 or functional derivatives thereof. Such genetic constructs are also contemplated to be
useful in modulating expression of specific genes in which mcg4 is involved in tissue-specific
or temporal regulation.

Accordingly, another aspect of the present invention is directed to a genetic construct
comprising a nucleotide sequence encoding a peptide, polypeptide or protein and mcg4 or a
functional derivative or homologue thereof capable of modulating the expression of said
nucleotide sequence.

The present invention is further described with reference to the following non-limiting Figures 
and Examples. 

In the Figures: 

Figure 1 is a representation of the nucleotide sequence and corresponding amino acid 
sequence of mcg4. 

Figure 2 is a representation of the alignment of the human MCG4 amino acid sequence with 
a translation of a partial murine expressed sequence tag (EST). 

Figure 3 is a representation of the alignment of the human MCG4 amino acid sequence with
a translation of a partial nematode EST.

Figure 4 is a diagrammatic representation showing a predicted structure of MCG4. 

Figure 5 is a representation of a sensitive sequence homology search of related
cysteine-containing motifs in another Caenorhabditis elegans protein.

Figure 6 is a representation showing that a related cysteine-containing motif is present in the
GATA-binding transcription factor from Schizosaccharomyces pombe.

EXAMPLE 1

A human gene (designated mcg4) was identified on chromosome 11q13 that on the basis of
sequence homology is predicted to encode a putative transcription factor of 310 amino acids
(Fig. 1). mcg4 is transcribed as an approximately 1.6kb mRNA.
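
The figure of 310 amino acids is consistent with the coding region annotated for SEQ ID NO:1 in the sequence listing (CDS location 30..959). The Python sketch below, offered only as an illustrative check, counts the codons in that interval and translates the first 53 nucleotides of SEQ ID NO:1 with the standard genetic code; the helper names are assumptions introduced here and are not part of the specification.

```python
# Illustrative sketch only: helper names are assumptions. The nucleotides are the
# first 53 bases of SEQ ID NO:1 as printed in the sequence listing.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

CDS_START, CDS_END = 30, 959  # 1-based, inclusive, from the CDS feature

SEQ1_FIRST_53 = (
    "TCAGTAAACA CAGAGACTGG GGATCGATC ATG GGG CTT TGT AAG TGC CCC AAG"
).replace(" ", "")


def cds_codon_count(start=CDS_START, end=CDS_END):
    """Number of complete codons in a 1-based inclusive interval."""
    length = end - start + 1
    if length % 3:
        raise ValueError("interval length is not a multiple of three")
    return length // 3


def translate(dna, start):
    """Translate dna from a 1-based start position, one letter per residue,
    stopping at a stop codon or when fewer than three bases remain."""
    residues = []
    for pos in range(start - 1, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


if __name__ == "__main__":
    print(cds_codon_count())                  # codons between positions 30 and 959
    print(translate(SEQ1_FIRST_53, CDS_START))  # Met Gly Leu Cys Lys Cys Pro Lys
```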

EXAMPLE 2

The expressed sequence tag (EST) database contains partial sequence data for the murine
(Fig. 2) and nematode (Fig. 3) homologues.

EXAMPLE 3 

MCG4 contains a sequence of cysteine residues within the N-terminal region of the protein
that resembles zinc-finger binding domains of a novel type, ie. (HC3)2 [Fig. 4].
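
The distribution of cysteine and histidine residues near the N-terminus can be tabulated directly from SEQ ID NO:2. The Python sketch below lists their positions within the first 32 residues; it is illustrative only, the name `residue_positions` is an assumption, and no attempt is made to assign particular residues to the proposed (HC3)2 arrangement.

```python
# Illustrative sketch only: names are assumptions. The residues are the first 32
# amino acids of SEQ ID NO:2 as printed in the sequence listing.
N_TERMINAL_RESIDUES = (
    "Met Gly Leu Cys Lys Cys Pro Lys Arg Lys Val Thr Asn Leu Phe Cys "
    "Phe Glu His Arg Val Asn Val Cys Glu His Cys Leu Val Ala Asn His"
).split()


def residue_positions(residues, wanted):
    """Map each three-letter code in wanted to its 1-based positions."""
    return {
        code: [i for i, res in enumerate(residues, start=1) if res == code]
        for code in wanted
    }


if __name__ == "__main__":
    print(residue_positions(N_TERMINAL_RESIDUES, ("Cys", "His")))
```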

EXAMPLE 4 

Sensitive sequence homology searches reveal that related cysteine-containing motifs are
present in another C. elegans protein (Fig. 5) as well as the GATA-binding transcription
factor from S. pombe (Fig. 6).

EXAMPLE 5

mcg4 will have commercial value due to its likelihood of encoding a novel transcription factor
that is highly conserved amongst organisms, thus suggesting an integral role in gene
regulation. mcg4 may also be involved in some way in tissue-specific or temporal regulation
of certain genes, thus making it a potential target for modulating expression of those
downstream effectors.

Those skilled in the art will appreciate that the invention described herein is susceptible to
variations and modifications other than those specifically described. It is to be understood that
the invention includes all such variations and modifications. The invention also includes all of
the steps, features, compositions and compounds referred to or indicated in this specification,
individually or collectively, and any and all combinations of any two or more of said steps or
features.

SEQUENCE LISTING

(i) APPLICANT: The Council of The Queensland Institute of Medical Research

(ii) TITLE OF INVENTION: A NOVEL GENE AND USES THEREFOR

(iii) NUMBER OF SEQUENCES: 2

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: DAVIES COLLISON CAVE

(B) STREET: 1 LITTLE COLLINS STREET

(F) ZIP: 3000

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1242 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 30..959

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TCAGTAAACA CAGAGACTGG GGATCGATC ATG GGG CTT TGT AAG TGC CCC AAG    53
Met Gly Leu Cys Lys Cys Pro Lys
1 5

AGA AAG GTG ACC AAC CTG TTC TGC TTC GAA CAT CGG GTC AAC GTC TGC    101
Arg Lys Val Thr Asn Leu Phe Cys Phe Glu His Arg Val Asn Val Cys
10 15 20

GAG CAC TGC CTG GTA GCC AAT CAC GCC AAG TGC ATC GTC CAG TCC TAC    149
Glu His Cys Leu Val Ala Asn His Ala Lys Cys Ile Val Gln Ser Tyr
25 30 35 40

CTG CAA TGG CTC CAA GAT AGC GAC TAC AAC CCC AAT TGC CGC CTG TGC    197
Leu Gln Trp Leu Gln Asp Ser Asp Tyr Asn Pro Asn Cys Arg Leu Cys
45 50 55

AAC ATA CCC CTG GCC AGC CGA GAG ACG ACC CGC CTT GTC TGC TAT GAT    245
Asn Ile Pro Leu Ala Ser Arg Glu Thr Thr Arg Leu Val Cys Tyr Asp
60 65 70

CTC TTT CAC TGG GCC TGC CTC AAT GAA CGT GCT GCC CAG CTA CCC CGA    293
Leu Phe His Trp Ala Cys Leu Asn Glu Arg Ala Ala Gln Leu Pro Arg
75 80 85

AAC ACG GCA CCT GCC GGC TAT CAG TGC CCC AGC TGC AAT GGC CCC ATC    341
Asn Thr Ala Pro Ala Gly Tyr Gln Cys Pro Ser Cys Asn Gly Pro Ile
90 95 100

TTC CCC CCA ACC AAC CTG GCT GGC CCC GTG GCC TCC GCA CTG AGA GAG    389
Phe Pro Pro Thr Asn Leu Ala Gly Pro Val Ala Ser Ala Leu Arg Glu
105 110 115 120

AAG CTG GCC ACA GTC AAC TGG GCC CGG GCA GGA CTG GGC CTC CCT CTG    437
Lys Leu Ala Thr Val Asn Trp Ala Arg Ala Gly Leu Gly Leu Pro Leu
125 130 135

ATC GAT GAG GTG GTG AGC CCA GAG CCC GAG CCC CTC AAC ACG TCT GAC    485
Ile Asp Glu Val Val Ser Pro Glu Pro Glu Pro Leu Asn Thr Ser Asp
140 145 150

TTC TCT GAC TGG TCT AGT TTT AAT GCC AGC AGT ACC CCT GGA CCA GAG    533
Phe Ser Asp Trp Ser Ser Phe Asn Ala Ser Ser Thr Pro Gly Pro Glu
155 160 165

GAG GTA GAC AGC GCC TCT GCT GCC CCA GCC TTC TAC AGC CAG GCC CCC    581
Glu Val Asp Ser Ala Ser Ala Ala Pro Ala Phe Tyr Ser Gln Ala Pro
170 175 180

CGG CCC CCA GCT TCC CCA GGC CGG CCC GAG CAG CAC ACA GTG ATC CAC    629
Arg Pro Pro Ala Ser Pro Gly Arg Pro Glu Gln His Thr Val Ile His
185 190 195 200

ATG GGC AAT CCT GAG CCC TTG ACT CAC GCC CCT AGG AAG GTG TAT GAT    677
Met Gly Asn Pro Glu Pro Leu Thr His Ala Pro Arg Lys Val Tyr Asp
205 210 215

ACG CGG GAT GAT GAC CGG ACA CCA GGC CTC CAT GGA GAC TGT GAC GAT    725
Thr Arg Asp Asp Asp Arg Thr Pro Gly Leu His Gly Asp Cys Asp Asp
220 225 230

GAC AAG TAC CGA CGT CGG CCG GCC TTG GGT TGG CTG GCC CGG CTG CTA    773
Asp Lys Tyr Arg Arg Arg Pro Ala Leu Gly Trp Leu Ala Arg Leu Leu
235 240 245

AGG AGC CGG GCT GGG TCT CGG AAG CGA CCG CTG ACC CTG CTC CAG CGG    821
Arg Ser Arg Ala Gly Ser Arg Lys Arg Pro Leu Thr Leu Leu Gln Arg
250 255 260

GCG GGG CTG CTG CTA CTC TTG GGA CTG CTG GGC TTC CTG GCC CTC CTT    869
Ala Gly Leu Leu Leu Leu Leu Gly Leu Leu Gly Phe Leu Ala Leu Leu
265 270 275 280

GCC CTC ATG TCT CGC CTA GGC CGG GCC GCA GCT GAC AGC GAT CCC AAC    917
Ala Leu Met Ser Arg Leu Gly Arg Ala Ala Ala Asp Ser Asp Pro Asn
285 290 295

CTG GAC CCA CTC ATG AAC CCT CAC ATC CGC GTG GGC CCC TCC TGA    962
Leu Asp Pro Leu Met Asn Pro His Ile Arg Val Gly Pro Ser *
300 305 310

GCCCCCTTGC TTGTGGCTAG GCCAGCCTAG GATGTGGGTT CTGTGGAGGA GAGGCGGGGT    1022

AATGGGGAGG CTGAGGGCAC CTCTTCACTG CCCCTCTCCC TCAAGCCTAA GACACTAAGA    1082

CCCCAGACCC AAAGCCAAGT CCACCAGAGT GGCTCGCAGG CCAGGCCTGG AGTCCCCGTG    1142

GGTCAAGCAT TTGTCTTGAC TTGCTTTCTC CCGGGTCTCC AGCCTCCGAC CCCTCGCCCC    1202

ATGAAGGAGC TGGCAGGTGG AAATAAACAA CAACTTTATT    1242

(2) INFORMATION FOR SEQ ID NO : 2 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 310 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 

Met Gly Leu Cys Lys Cys Pro Lys Arg Lys Val Thr Asn Leu Phe Cys 
1 5 10 15

Phe Glu His Arg Val Asn Val Cys Glu His Cys Leu Val Ala Asn His 
20 25 30 

Ala Lys Cys Ile Val Gln Ser Tyr Leu Gln Trp Leu Gln Asp Ser Asp
35 40 45 

Tyr Asn Pro Asn Cys Arg Leu Cys Asn Ile Pro Leu Ala Ser Arg Glu
50 55 60 

Thr Thr Arg Leu Val Cys Tyr Asp Leu Phe His Trp Ala Cys Leu Asn
65 70 75 80 

Glu Arg Ala Ala Gln Leu Pro Arg Asn Thr Ala Pro Ala Gly Tyr Gln
85 90 95 

Cys Pro Ser Cys Asn Gly Pro Ile Phe Pro Pro Thr Asn Leu Ala Gly
100 105 110 

Pro Val Ala Ser Ala Leu Arg Glu Lys Leu Ala Thr Val Asn Trp Ala 
115 120 125 

Arg Ala Gly Leu Gly Leu Pro Leu Ile Asp Glu Val Val Ser Pro Glu
130 135 140 

Pro Glu Pro Leu Asn Thr Ser Asp Phe Ser Asp Trp Ser Ser Phe Asn 
145 150 155 160 

Ala Ser Ser Thr Pro Gly Pro Glu Glu Val Asp Ser Ala Ser Ala Ala 
165 170 175 

Pro Ala Phe Tyr Ser Gln Ala Pro Arg Pro Pro Ala Ser Pro Gly Arg
180 185 190 

Pro Glu Gln His Thr Val Ile His Met Gly Asn Pro Glu Pro Leu Thr
195 200 205 

His Ala Pro Arg Lys Val Tyr Asp Thr Arg Asp Asp Asp Arg Thr Pro 
210 215 220 

Gly Leu His Gly Asp Cys Asp Asp Asp Lys Tyr Arg Arg Arg Pro Ala 
225 230 235 240 

Leu Gly Trp Leu Ala Arg Leu Leu Arg Ser Arg Ala Gly Ser Arg Lys 
245 250 255 

Arg Pro Leu Thr Leu Leu Gln Arg Ala Gly Leu Leu Leu Leu Leu Gly
260 265 270 

Leu Leu Gly Phe Leu Ala Leu Leu Ala Leu Met Ser Arg Leu Gly Arg 
275 280 285 

Ala Ala Ala Asp Ser Asp Pro Asn Leu Asp Pro Leu Met Asn Pro His 
290 295 300 

Ile Arg Val Gly Pro Ser
305 310 



DATED this 23rd day of May 1997

The Council of The Queensland Institute of Medical Research 
By DAVIES COLLISON CAVE 
Patent Attorneys for the Applicants 



FIGURE 1 



TCAGTAAACA CAGAGACTGG GGATCGATC ATG GGG CTT TGT AAG TGC CCC AAG 53 

Met Gly Leu Cys Lys Cys Pro Lys 
1 5 

AGA AAG GTG ACC AAC CTG TTC TGC TTC GAA CAT CGG GTC AAC GTC TGC 101 
Arg Lys Val Thr Asn Leu Phe Cys Phe Glu His Arg Val Asn Val Cys 
10 15 20 

GAG CAC TGC CTG GTA GCC AAT CAC GCC AAG TGC ATC GTC CAG TCC TAC 149
Glu His Cys Leu Val Ala Asn His Ala Lys Cys Ile Val Gln Ser Tyr
25 30 35 40 

CTG CAA TGG CTC CAA GAT AGC GAC TAC AAC CCC AAT TGC CGC CTG TGC 197 
Leu Gln Trp Leu Gln Asp Ser Asp Tyr Asn Pro Asn Cys Arg Leu Cys
45 50 55 

AAC ATA CCC CTG GCC AGC CGA GAG ACG ACC CGC CTT GTC TGC TAT GAT 245
Asn Ile Pro Leu Ala Ser Arg Glu Thr Thr Arg Leu Val Cys Tyr Asp
60 65 70 

CTC TTT CAC TGG GCC TGC CTC AAT GAA CGT GCT GCC CAG CTA CCC CGA 293
Leu Phe His Trp Ala Cys Leu Asn Glu Arg Ala Ala Gln Leu Pro Arg
75 80 85 

AAC ACG GCA CCT GCC GGC TAT CAG TGC CCC AGC TGC AAT GGC CCC ATC 341 
Asn Thr Ala Pro Ala Gly Tyr Gln Cys Pro Ser Cys Asn Gly Pro Ile
90 95 100 

TTC CCC CCA ACC AAC CTG GCT GGC CCC GTG GCC TCC GCA CTG AGA GAG 389
Phe Pro Pro Thr Asn Leu Ala Gly Pro Val Ala Ser Ala Leu Arg Glu
105 110 115 120 

AAG CTG GCC ACA GTC AAC TGG GCC CGG GCA GGA CTG GGC CTC CCT CTG 437 
Lys Leu Ala Thr Val Asn Trp Ala Arg Ala Gly Leu Gly Leu Pro Leu 
125 130 135 

ATC GAT GAG GTG GTG AGC CCA GAG CCC GAG CCC CTC AAC ACG TCT GAC 485 
Ile Asp Glu Val Val Ser Pro Glu Pro Glu Pro Leu Asn Thr Ser Asp
140 145 150 

TTC TCT GAC TGG TCT AGT TTT AAT GCC AGC AGT ACC CCT GGA CCA GAG 533
Phe Ser Asp Trp Ser Ser Phe Asn Ala Ser Ser Thr Pro Gly Pro Glu
155 160 165 

GAG GTA GAC AGC GCC TCT GCT GCC CCA GCC TTC TAC AGC CAG GCC CCC 581 
Glu Val Asp Ser Ala Ser Ala Ala Pro Ala Phe Tyr Ser Gln Ala Pro
170 175 180 

CGG CCC CCA GCT TCC CCA GGC CGG CCC GAG CAG CAC ACA GTG ATC CAC 629
Arg Pro Pro Ala Ser Pro Gly Arg Pro Glu Gln His Thr Val Ile His
185 190 195 200 

ATG GGC AAT CCT GAG CCC TTG ACT CAC GCC CCT AGG AAG GTG TAT GAT 677 
Met Gly Asn Pro Glu Pro Leu Thr His Ala Pro Arg Lys Val Tyr Asp 
205 210 215 



Figure 1 (continued) 

ACG CGG GAT GAT GAC CGG ACA CCA GGC CTC CAT GGA GAC TGT GAC GAT 725
Thr Arg Asp Asp Asp Arg Thr Pro Gly Leu His Gly Asp Cys Asp Asp 
220 225 230 

GAC AAG TAC CGA CGT CGG CCG GCC TTG GGT TGG CTG GCC CGG CTG CTA 773
Asp Lys Tyr Arg Arg Arg Pro Ala Leu Gly Trp Leu Ala Arg Leu Leu 
235 240 245 

AGG AGC CGG GCT GGG TCT CGG AAG CGA CCG CTG ACC CTG CTC CAG CGG 821
Arg Ser Arg Ala Gly Ser Arg Lys Arg Pro Leu Thr Leu Leu Gln Arg
250 255 260 

GCG GGG CTG CTG CTA CTC TTG GGA CTG CTG GGC TTC CTG GCC CTC CTT 869
Ala Gly Leu Leu Leu Leu Leu Gly Leu Leu Gly Phe Leu Ala Leu Leu 
265 270 275 280 

GCC CTC ATG TCT CGC CTA GGC CGG GCC GCA GCT GAC AGC GAT CCC AAC 917
Ala Leu Met Ser Arg Leu Gly Arg Ala Ala Ala Asp Ser Asp Pro Asn 
285 290 295 

CTG GAC CCA CTC ATG AAC CCT CAC ATC CGC GTG GGC CCC TCC TGA 962
Leu Asp Pro Leu Met Asn Pro His Ile Arg Val Gly Pro Ser *
300 305 310 

Figure 2



gb|AA155210|AA155210 nur98e01.rl Stratagene mouse embryonic carcinoma
(#937317) Mus musculus cDNA clone 605496 5'

Query: 1 MGLCKCPKRKVTNLFCFEHRVNVCEHCLVANHAKCIVQSYLQWLQDSDYNPNCRLCNIPL 60

MGI^KCPKRKVT^ILFCFEHRV^A/CEHCLVANHAKC^VQSYI^W^ PL 
Sbjct: 98 MGUrKCPKRKVTl^FCFEHRVNVCraCl.VANHAKCrVQSY^^ 277 

Figure 3 

dbj|D75913|CELK111G3F C. elegans cDNA clone yk111g3 : 5' end, single read.

Query: 7 PKRKVTNLFCFEHRVNVCEHCLVANHAKCIVQSYLQWLQDSDYNPNCRLCNIPLASRETT 66

PKRKVTNLF +EHRVNVCE LV NH C+VQSYL WL D DY+PNC LC L +T 
Sbjct: 1 PKRKVT^^.FXYEHRVWCELXLVDi^ 180 

Query: 67 RLVCYDLFHWACLNERAAQLPRNTAPAGYQCP 98 98 PSCNGPIFPPTN 109

RL C L HW C +E P TAP GY+CP P C+ +FPP-M2 

Sbjct: 181 RLNCLHLLHWKCFDEWXGNFPOTTAPXGYRCP 276 275 PCCSQEVFPPDQ 310 




Figure 4 




Figure 5 



sp|P46580|YI£5_CAEEL HYPOTHETICAL 146.8 KD PROTEIN C34E10.5 IN
CHROMOSOME III gi|500728 (U10402) C34E10.5 gene product
[Caenorhabditis elegans]

Query: 56 CNIPLASRETTRLVCYDLFHWACLNERAAQLPRNTAPAGYQCPSC 100

C+IL++ + LC LFWC+EA + + + +CP C 

Sbjct: 1222 CSICLENKNPSALFCGHLFCWTCIQEHAVAATSSASTSSARCPQC 1266 



Figure 6 

gi|703468 (L29051) homologous to GATA-binding transcription factor
[Schizosaccharomyces pombe]

Query: 35 CIVQSYLQWLQDSDYNPNCRLCNI 58

C + +W +D NP C C * 
Sbjct: 175 CATTNTPKWRRDESGNPXCNACGL 198 

Query: 162 SSTPGPEEVDSASAAPAFYSQAPRPPASPGRPEQHTVIHMGNPEPLTHAPRKVYDTRDDD 221

+ S PEE S S S P-t- SP + +Q +1 P +V + D 

Sbjct: 441 AS1J^NPEEPPSNSD?:0PST1SNGPKSEVSPSQSQQAPLIQSSTSPVSLQFPPEVQGSNVDK 500 



Query: 222 RTPGLH 227

R L^-

Sbjct: 501 RNYALN 506



